Abstract. We obtain large deviations theorems for both discrete time expressions
of the form $\sum_{n=1}^{N} F(X(q_1(n)),\ldots,X(q_\ell(n)))$ and similar expressions
of the form $\int_0^T F(X(q_1(t)),\ldots,X(q_\ell(t)))\,dt$ in continuous time. Here
$X(n), n\ge 0$ or $X(t), t\ge 0$ is a Markov process satisfying Doeblin's condition,
$F$ is a bounded continuous function and $q_i(n)=in$ for $i\le k$ while for
$i>k$ they are positive functions taking on integer values on integers with some
growth conditions which are satisfied, for instance, when the $q_i$'s are polynomials
of increasing degrees. Applications to some types of dynamical systems such as
mixing subshifts of finite type and hyperbolic and expanding transformations
are obtained as well.



1. Introduction 

Nonconventional ergodic theorems, which attracted substantial attention in ergodic
theory (see, for instance, [2] and [13]), studied the limits of expressions having
the form $\frac{1}{N}\sum_{n=1}^{N} T^{q_1(n)}f_1\cdots T^{q_\ell(n)}f_\ell$ where $T$ is a
weakly mixing measure preserving transformation, the $f_i$'s are bounded measurable
functions and the $q_i$'s are polynomials taking on integer values on the integers.
While, for instance, [2] and [13] were interested in $L^2$ convergence, other papers
such as [1] provided conditions for almost sure convergence in such ergodic theorems.
Originally, these results were motivated by applications to multiple recurrence for
dynamical systems, taking the functions $f_i$ to be indicators of some measurable sets.

Introducing stronger mixing or weak dependence conditions enabled us in [22]
to obtain functional central limit theorems for even more general expressions of the
form

(1.1)  $\frac{1}{\sqrt N}\sum_{n=1}^{[Nt]}\big(F(X(q_1(n)),\ldots,X(q_\ell(n)))-\bar F\big)$

where $X(n), n\ge 0$ is a sufficiently fast mixing vector valued process with some
moment conditions and stationarity properties, $F$ is a locally Hölder continuous
function with polynomial growth, $\bar F=\int F\,d(\mu\times\cdots\times\mu)$ and $\mu$ is the
distribution of $X(0)$. In order to ensure existence of limiting variances and
covariances we had to impose certain assumptions concerning the functions $q_j(n)$,
$j\ge 1$, saying that there exists an integer $k\ge 1$ such that $q_j(n)=jn$ for
$j=1,\ldots,k$ while $q_j(n)$, $j>k$ are positive functions taking on integer values
on integers with some (faster than linear) growth conditions.

The next natural step in the study of limiting behavior of nonconventional
sums $S_N=\sum_{n=1}^{N}F(X(q_1(n)),\ldots,X(q_\ell(n)))$ is to obtain large deviations
estimates. Namely, we will be interested in this paper in the asymptotic behavior
as $N\to\infty$ of the probabilities

(1.2)  $P\{\frac{1}{N}S_N\in\Gamma\}$

for various (open or closed) sets $\Gamma\subset\mathbb{R}$. According to [19] under appropriate
conditions $\frac{1}{N}S_N$ converges with probability one as $N\to\infty$ to
$\bar F=\int F\,d\mu\times\cdots\times\mu$, where $\mu$ is the common distribution of the
$X(n)$'s. Thus, as usual, (1.2) describes deviations of $\frac{1}{N}S_N$ from the limit
in the law of large numbers.
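
The law of large numbers behind (1.2) can be observed numerically. The following is only a minimal sketch under assumptions that are not taken from the text: the process is an i.i.d. sequence of fair 0/1 values, $F(x_1,x_2)=x_1x_2$, $q_1(n)=n$ and $q_2(n)=n^2$, so that the limit $\bar F$ equals 1/4. The helper name `nonconventional_average` is hypothetical.

```python
import random


def nonconventional_average(N, seed=0):
    """Return S_N / N for the illustrative choice F(x1, x2) = x1 * x2,
    q_1(n) = n, q_2(n) = n**2 and i.i.d. fair 0/1 variables X(j).
    All of these choices are assumptions made for this sketch only."""
    rng = random.Random(seed)
    samples = {}  # X(j) is generated lazily, since indices run up to N**2

    def X(j):
        if j not in samples:
            samples[j] = rng.randint(0, 1)
        return samples[j]

    total = 0
    for n in range(1, N + 1):
        total += X(n) * X(n * n)
    return total / N


if __name__ == "__main__":
    # The averages should settle near F-bar = 1/4 as N grows.
    for N in (100, 1000, 10000):
        print(N, nonconventional_average(N))
```

Large deviations estimates quantify how fast the probability of an average staying away from 1/4 decays in $N$; the sketch only displays the typical behavior.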

The study of asymptotics of probabilities in (1.2) leads to what is usually called
the first level of large deviations. We will study also second level large deviations
estimates which means in our setup to consider occupational measures

(1.3)  $\zeta_N=\frac{1}{N}\sum_{n=1}^{N}\delta_{(X(q_1(n)),\ldots,X(q_\ell(n)))}$

and to study the asymptotic behavior as $N\to\infty$ of the probabilities
$P\{\zeta_N\in U\}$ where $U$ is a subset of the space of probability measures on a
corresponding product space. In addition, we will consider also large deviations in
the averaging setup, namely, for the "slow" variable $\Xi^\varepsilon(n)=\Xi^\varepsilon_x(n)$ given
by a difference equation of the form

(1.4)  $\Xi^\varepsilon(n+1)=\Xi^\varepsilon(n)+\varepsilon F(\Xi^\varepsilon(n),X(q_1(n)),\ldots,X(q_\ell(n)))$, $n=0,1,\ldots$, $\Xi^\varepsilon_x(0)=x$

which is actually a generalization of the above since if $F(\xi,x_1,\ldots,x_\ell)$ does not
depend on $\xi$ then $\Xi^\varepsilon(N)=\frac{1}{N}S_N$ for $\varepsilon=1/N$ and $x=0$ (up to a shift
of the summation index). We will deal also with continuous time versions
of the above results considering $S_T=\int_0^T F(X(q_1(t)),\ldots,X(q_\ell(t)))\,dt$ for some
stochastic process $X(s), s\ge 0$.

As for conventional sums ($\ell=k=1$), meaningful large deviations estimates can
be obtained only for some specific classes of stochastic processes and dynamical
systems. In our more general situation we also assume that in the probabilistic
setup $X(n), n=0,1,\ldots$ is a Markov chain satisfying a (strong) Doeblin condition
while in the dynamical systems setup we can consider $X(n)=X(n,\omega)=f(T^n\omega)$
where $T$ is either a mixing subshift of finite type or a hyperbolic diffeomorphism
or an expanding transformation and $f$ is a Hölder continuous (vector) function.
In the continuous time case we take the underlying process $X(t)$ to be in the
probabilistic setup either an irreducible finite Markov chain with continuous time
or a nondegenerate diffusion on a compact manifold while in the dynamical systems
setup we can take $X(t)=X(t,\omega)=f(T^t\omega)$ where $T^t, t\ge 0$ is a hyperbolic flow
on a compact manifold and $f$ is a Hölder continuous (vector) function.

We will show that it is not difficult to reduce the problem to the case
$k=\ell$ and the major problems arise only in dealing with the random variables
$X(n),X(2n),\ldots,X(kn)$. When $k=1$ the above reduction leads to the standard
(conventional) setup of large deviations. When $k>1$ the general case of
Markov sequences requires a quite elaborate technique and a lengthy proof and it
will be treated in another paper, while here for $k>1$ we restrict ourselves to
independent identically distributed (i.i.d.) sequences $X(n), n\ge 0$ which, unlike in
the conventional setup, is still nontrivial.

Both probabilistic and dynamical systems setups are united by common ideas
and motivations but their machineries are quite different, and for this reason most of
this paper deals with the probabilistic setup and only in the last Section 5 we discuss
some of the dynamical systems results, which can especially benefit readers familiar
with this field.

2. Preliminaries and main results 

We start with the probabilistic discrete time setup where the underlying process
$X(0), X(1), X(2), \ldots$ is a Markov chain defined on a probability space $(\Omega,\mathcal{F},P)$
and evolving on a Polish measurable space $(M,\mathcal{B})$ as its phase space. We assume a
"strong" Doeblin condition saying that for some integer $n_0>0$, a constant $C>0$
and a probability measure $\nu$ on $M$ the $n_0$-step transition probability $P(n_0,x,\cdot)$ of
the above Markov chain $X$ satisfies

(2.1)  $C^{-1}\nu(G)\le P(n_0,x,G)\le C\nu(G)$

for any $x\in M$ and every measurable set $G\subset M$. It is well known (see, for instance,
[8]) that (2.1) implies existence of a unique invariant measure $\mu$ of the Markov chain
$X$ and the equality $\mu(G)=\int d\mu(x)P(n_0,x,G)$ yields that

(2.2)  $C^{-1}\le\frac{d\mu}{d\nu}(x)=p(x)\le C$

where $d\mu/d\nu$ denotes the Radon-Nikodym derivative.

In all cases our setup includes also a bounded measurable function $F=
F(x_1,x_2,\ldots,x_\ell)$ on the $\ell$-times product space $M^\ell=M\times\cdots\times M$. The setup
becomes complete with the introduction of positive increasing functions $q_j$, $j=1,\ldots,\ell$
taking on integer values on integers and such that

(2.3)  $q_j(n)=jn$ for $j=1,\ldots,k$ and some $k\le\ell$

while for $j=k+1,\ldots,\ell$ and any $\gamma>0$,

(2.4)  $\lim_{n\to\infty}(q_j(n)-q_j(n-1))=\infty$ and $\liminf_{n\to\infty}(q_j(\gamma n)-q_{j-1}(n))>0$.

For any function $W$ on $M^\ell$ we denote by $\hat W$ the function on $M$ defined by

(2.5)  $\hat W(x)=\int\exp(W(x,x_2,\ldots,x_\ell))\,d\mu(x_2)\cdots d\mu(x_\ell)$.

As usual we denote by $P_x$ the probability conditioned on $X(0)=x$ and by $E_x$ the
corresponding expectation. Now, we can formulate our first result.

2.1. Theorem. Let $W_\lambda(x_1,\ldots,x_\ell)$, $\lambda\in(-\infty,\infty)$ be a differentiable in $\lambda$
family of bounded measurable functions on $M^\ell$ such that $dW_\lambda(x_1,\ldots,x_\ell)/d\lambda$ is
bounded for each $\lambda$, as well. Assume that $k=1$ in (2.3) and (2.4). Then for any
$x\in M$ the limit

(2.6)  $Q(W_\lambda)=\lim_{N\to\infty}\frac{1}{N}\ln E_x\exp\Big(\sum_{n=1}^{N}W_\lambda(X(q_1(n)),\ldots,X(q_\ell(n)))\Big)$

exists, it is independent of $x$ and it is differentiable in $\lambda$. In fact, $Q(W_\lambda)=\ln r(W_\lambda)$
where $r(W)$ is the spectral radius of the positive operator $R(W)$ acting by

(2.7)  $R(W)g(x)=\int P(x,dy)g(y)\hat W(y)$.

Furthermore, set $W_\lambda(x_1,\ldots,x_\ell)=\lambda F(x_1,\ldots,x_\ell)$ and

(2.8)  $J(u)=\sup_\lambda(\lambda u-r(W_\lambda))$, $u\in\mathbb{R}$.

Then for any closed set $K\subset\mathbb{R}$,

(2.9)  $\limsup_{N\to\infty}\frac{1}{N}\ln P\{\frac{1}{N}S_N\in K\}\le-\inf_{u\in K}J(u)$

and for any open set $U\subset\mathbb{R}$,

(2.10)  $\liminf_{N\to\infty}\frac{1}{N}\ln P\{\frac{1}{N}S_N\in U\}\ge-\inf_{u\in U}J(u)$

where, as before, $S_N=S_N(F)=\sum_{n=1}^{N}F(X(q_1(n)),\ldots,X(q_\ell(n)))$.
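
For a finite state space the quantity $\ln r(W_\lambda)$ from (2.6)-(2.7) can be computed directly. The sketch below is an illustration only and makes several assumptions that are not in the text: $M=\{0,1\}$, $\ell=2$, a two-state chain with hypothetical transition probabilities, and power iteration as the numerical method. The function names are hypothetical as well.

```python
import math


def stationary_two_state(a, b):
    """Invariant distribution of the chain P = [[1-a, a], [b, 1-b]]."""
    return [b / (a + b), a / (a + b)]


def log_spectral_radius(P, W, mu, iters=500):
    """ln r(W) for R(W)g(x) = sum_y P[x][y] * g(y) * W_hat(y), where
    W_hat(y) = sum_z exp(W(y, z)) * mu[z] (the case ell = 2 of (2.5)).
    Power iteration is an assumed numerical method, not the paper's."""
    m = len(P)
    W_hat = [sum(math.exp(W(y, z)) * mu[z] for z in range(m)) for y in range(m)]
    R = [[P[x][y] * W_hat[y] for y in range(m)] for x in range(m)]
    g = [1.0] * m
    log_r = 0.0
    for _ in range(iters):
        h = [sum(R[x][y] * g[y] for y in range(m)) for x in range(m)]
        norm = max(h)  # converges to the Perron eigenvalue of R
        g = [v / norm for v in h]
        log_r = math.log(norm)
    return log_r


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.2, 0.8]]          # hypothetical transition matrix
    mu = stationary_two_state(0.1, 0.2)
    F = lambda x, y: x * y                 # hypothetical bounded F
    for lam in (-1.0, 0.0, 1.0):
        print(lam, log_spectral_radius(P, lambda x, y: lam * F(x, y), mu))
```

Two sanity checks follow from the definitions: $W\equiv 0$ gives $R(W)=P$ and hence the value 0, and a constant $W\equiv c$ gives the value $c$.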

We observe that a very particular case of Theorem 2.1 when $\{X(n), n\ge 0\}$
are i.i.d. random variables was considered in Section 6 of [15]. Next, we describe
the second level of large deviations in the nonconventional setup which deals with
occupational measures $\zeta_N$ on $M^\ell$ given by (1.3) where $M$ is assumed to be a compact
space and $\delta_z$ is the unit mass concentrated at $z$. For any probability measure $\eta$ on
$M^\ell$ define

(2.11)  $I(\eta)=-\inf_{u\in C_+(M^\ell)}\int_{M^\ell}\ln\frac{E_{x_1}\int u(X(1),x_2,\ldots,x_\ell)\,d\mu(x_2)\cdots d\mu(x_\ell)}{u(x_1,\ldots,x_\ell)}\,d\eta(x_1,\ldots,x_\ell)$

where $C_+(\cdot)$ denotes the space of all positive continuous functions on the space in
brackets.

2.2. Theorem. Let $k=1$ in (2.3) and (2.4). Then for any continuous function
$W=W(x_1,\ldots,x_\ell)$ on $M^\ell$ the limit

(2.12)  $Q(W)=\lim_{N\to\infty}\frac{1}{N}\ln E_x\exp\Big(\sum_{n=1}^{N}W(X(q_1(n)),\ldots,X(q_\ell(n)))\Big)$

is a convex lower semicontinuous functional satisfying

(2.13)  $Q(W)=\sup_{\eta\in\mathcal{P}(M^\ell)}\Big(\int W(x_1,\ldots,x_\ell)\,d\eta(x_1,\ldots,x_\ell)-I(\eta)\Big)$

where $\mathcal{P}(\cdot)$ is the space of probability measures on the space in brackets considered
with the topology of weak convergence.

Furthermore, for any closed set $K\subset\mathcal{P}(M^\ell)$,

(2.14)  $\limsup_{N\to\infty}\frac{1}{N}\ln P\{\zeta_N\in K\}\le-\inf_{\eta\in K}I(\eta)$

and for any open set $U\subset\mathcal{P}(M^\ell)$,

(2.15)  $\liminf_{N\to\infty}\frac{1}{N}\ln P\{\zeta_N\in U\}\ge-\inf_{\eta\in U}I(\eta)$.
Next, we exhibit continuous time versions of the above results. Here we assume
that $X(t), t\ge 0$ is a Markov process on a Polish measurable space $(M,\mathcal{B})$ such
that for some $t_0>0$, a constant $C>0$ and a probability measure $\nu$ on $M$ the time
$t_0$ transition probability $P(t_0,x,\cdot)$ of the above Markov process $X$ satisfies

(2.16)  $C^{-1}\nu(G)\le P(t_0,x,G)\le C\nu(G)$

for any $x\in M$ and every measurable set $G\subset M$. Again (see [5]), (2.16) implies
existence of a unique invariant measure $\mu$ of the Markov process $X$ which satisfies
(2.2). Now we introduce positive increasing functions $q_j$, $j=1,\ldots,\ell$ on $\mathbb{R}_+$ such
that for some $0<\alpha_1<\alpha_2<\cdots<\alpha_k$ and $k\le\ell$,

(2.17)  $q_j(t)=\alpha_j t$ for $j=1,\ldots,k$

while for $j=k+1,\ldots,\ell$ and any $\gamma>0$,

(2.18)  $\lim_{t\to\infty}(q_j(t+\gamma)-q_j(t))=\infty$ and $\liminf_{t\to\infty}(q_j(\gamma t)-q_{j-1}(t))>0$.

We will be interested in large deviations estimates as $T\to\infty$ for

$S_T(F)=S_T=\int_0^T F(X(q_1(t)),\ldots,X(q_\ell(t)))\,dt$.

2.3. Theorem. Let $W_\lambda(x_1,\ldots,x_\ell)$, $\lambda\in(-\infty,\infty)$ be as in Theorem 2.1. Assume
that $k=1$ in (2.17) and (2.18). Then for any $x\in M$ the limit

(2.19)  $Q_{cont}(W_\lambda)=\lim_{T\to\infty}\frac{1}{T}\ln E_x\exp\Big(\int_0^T W_\lambda(X(q_1(t)),\ldots,X(q_\ell(t)))\,dt\Big)$

exists, it is independent of $x$ and it is differentiable in $\lambda$. In fact, $Q_{cont}(W_\lambda)=
\ln r_{cont}(W_\lambda)$ where $r_{cont}(W)$ is the spectral radius of the semigroup of positive
operators $R^t_{cont}(W)$ acting by the formula

(2.20)  $R^t_{cont}(W)g(x)=E_x\big(g(X(t))\hat W_{cont}(t)\big)$

where

(2.21)  $\hat W_{cont}(t)=\exp\Big(\int_0^t ds\int W_\lambda(X(\alpha_1 s),x_2,\ldots,x_\ell)\,d\mu(x_2)\cdots d\mu(x_\ell)\Big)$.

Furthermore, set $W_\lambda(x_1,\ldots,x_\ell)=\lambda F(x_1,\ldots,x_\ell)$ and define $J(u)=J_{cont}(u)$ by
(2.8) with $r_{cont}$ in place of $r$. Then for any closed set $K\subset\mathbb{R}$,

(2.22)  $\limsup_{T\to\infty}\frac{1}{T}\ln P\{\frac{1}{T}S_T\in K\}\le-\inf_{u\in K}J(u)$

and for any open set $U\subset\mathbb{R}$,

(2.23)  $\liminf_{T\to\infty}\frac{1}{T}\ln P\{\frac{1}{T}S_T\in U\}\ge-\inf_{u\in U}J(u)$.

The second level of large deviations in the continuous time nonconventional setup
deals with occupational measures

(2.24)  $\zeta_T=\frac{1}{T}\int_0^T\delta_{(X(q_1(t)),\ldots,X(q_\ell(t)))}\,dt$

on $M^\ell$. Now we assume that $X(t), t\ge 0$ is a diffusion process on a compact
Riemannian manifold $M$ with the generator $L$ which is a nondegenerate second
order elliptic differential operator. For any probability measure $\eta$ on $M^\ell$ set

2.25 /conti 7 ? ) = - m £ / 7 \ 

u£D+J M U[Xl,X2,—,Xe) 

where the infimum is taken over all positive u from the domain of L. 

2.4. Theorem. Let $k=1$ in (2.17) and (2.18). Then for any continuous function
$W=W(x_1,\ldots,x_\ell)$ on $M^\ell$ the limit

(2.26)  $Q_{cont}(W)=\lim_{T\to\infty}\frac{1}{T}\ln E_x\exp\Big(\int_0^T W(X(q_1(t)),\ldots,X(q_\ell(t)))\,dt\Big)=r_{cont}(W)$

is a convex lower semicontinuous functional satisfying

(2.27)  $Q_{cont}(W)=\sup_{\eta\in\mathcal{P}(M^\ell)}\Big(\int W(x_1,\ldots,x_\ell)\,d\eta(x_1,\ldots,x_\ell)-I_{cont}(\eta)\Big)$.

Furthermore, for any closed set $K\subset\mathcal{P}(M^\ell)$,

(2.28)  $\limsup_{T\to\infty}\frac{1}{T}\ln P\{\zeta_T\in K\}\le-\inf_{\eta\in K}I_{cont}(\eta)$

and for any open set $U\subset\mathcal{P}(M^\ell)$,

(2.29)  $\liminf_{T\to\infty}\frac{1}{T}\ln P\{\zeta_T\in U\}\ge-\inf_{\eta\in U}I_{cont}(\eta)$.

A similar result holds true when $X(t)$ is a nondegenerate continuous time Markov
chain with a finite state space.

Next, we describe our large deviations estimates in a nonconventional averaging
setup. Here we consider either a difference equation (1.4) for $\Xi^\varepsilon(n)$ in the discrete
time case where $X(n), n\ge 0$ is a Markov chain satisfying the conditions of Theorem
2.1 or a differential equation for $\Xi^\varepsilon(t)=\Xi^\varepsilon_x(t)\in\mathbb{R}^d$, $t\ge 0$,

(2.30)  $\frac{d\Xi^\varepsilon(t)}{dt}=\varepsilon F(\Xi^\varepsilon(t),X(q_1(t)),\ldots,X(q_\ell(t)))$, $\Xi^\varepsilon(0)=x$

in the continuous time setup where $X(t), t\ge 0$ is a Markov process satisfying the
conditions of Theorem 2.3. We assume that $F(\xi,x_1,\ldots,x_\ell)$ is bounded and Lipschitz
continuous in $\xi$. The setup of (2.30) emerges when considering, for instance, a time
dependent small perturbation of the oscillator equation

(2.31)  $\ddot x+\lambda^2 x=\varepsilon g(x,\dot x,t)$

where the force term $g$ depends on time in a random way, $g(x,y,t)=
g(x,y,X(q_1(t)),\ldots,X(q_\ell(t)))$. Then passing to the polar coordinates $(r,\phi)$ with
$x=r\sin(\lambda(t-\phi))$ and $\dot x=\lambda r\cos(\lambda(t-\phi))$ the equation (2.31) will be transformed
into (2.30) with $\Xi^\varepsilon=(r,\phi)$. It seems reasonable that a random force may depend
on versions of the same process moving with different speeds, which is what we have
here.

As is well known (see, for instance, [25]), if $F(\xi,x_1,\ldots,x_\ell)$ is bounded and
Lipschitz continuous in $\xi$ then whenever for each $\xi$ the (pointwise) limit

$\bar F(\xi)=\lim_{T\to\infty}\frac{1}{T}\int_0^T F(\xi,X(q_1(t)),\ldots,X(q_\ell(t)))\,dt$

exists then for any $T>0$,

$\lim_{\varepsilon\to 0}\sup_{0\le t\le T/\varepsilon}|\Xi^\varepsilon(t)-\bar\Xi^\varepsilon(t)|=0$

where

$\frac{d\bar\Xi^\varepsilon(t)}{dt}=\varepsilon\bar F(\bar\Xi^\varepsilon(t))$, $\bar\Xi^\varepsilon(0)=x$.

In the discrete time case we have to take

$\bar F(\xi)=\lim_{N\to\infty}\frac{1}{N}\sum_{n=0}^{N}F(\xi,X(q_1(n)),\ldots,X(q_\ell(n)))$.

Almost everywhere limits of the averages above can be obtained by nonconventional
pointwise ergodic theorems from [4] and [1], respectively, in rather general
circumstances in the dynamical systems case, and under another set of conditions
existence of such limits follows from [19]. The next natural step here is to obtain
large deviations estimates for the above approximation of the slow motion $\Xi^\varepsilon$ by
the averaged one $\bar\Xi^\varepsilon$.
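
The averaging approximation described above can be observed on a toy example. The sketch below is only an illustration under assumptions that are not in the text: the scalar recursion (1.4) with the hypothetical choice $F(\xi,x_1,x_2)=x_1x_2-\xi$ (Lipschitz in $\xi$, bounded on the range visited), i.i.d. fair 0/1 variables, $q_1(n)=n$ and $q_2(n)=n^2$. Then $\bar F(\xi)=1/4-\xi$ and the averaged motion at time $T/\varepsilon$ equals $1/4+(x-1/4)e^{-T}$.

```python
import math
import random


def slow_motion(eps, T, x0=1.0, seed=0):
    """Iterate Xi(n+1) = Xi(n) + eps * F(Xi(n), X(n), X(n**2)) for
    n = 0, ..., T/eps - 1 with the assumed F(xi, x1, x2) = x1 * x2 - xi."""
    rng = random.Random(seed)
    samples = {}  # lazily sampled i.i.d. fair 0/1 values X(j)

    def X(j):
        if j not in samples:
            samples[j] = rng.randint(0, 1)
        return samples[j]

    xi = x0
    for n in range(round(T / eps)):
        xi += eps * (X(n) * X(n * n) - xi)
    return xi


def averaged(T, x0=1.0):
    """Averaged motion at rescaled time T for F-bar(xi) = 1/4 - xi."""
    return 0.25 + (x0 - 0.25) * math.exp(-T)


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, slow_motion(eps, 3.0), averaged(3.0))
```

As $\varepsilon$ decreases the two printed values should approach each other; large deviations estimates describe the probability that they do not.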
For any $\eta\in\mathcal{P}(M^\ell)$ set

(2.32)  $\bar B_\eta(\xi)=\int B(\xi,x_1,\ldots,x_\ell)\,d\eta(x_1,\ldots,x_\ell)$.

For each absolutely continuous curve $\gamma_t$, $t\in[0,T]$ set

$S_{0T}(\gamma)=\int_0^T\inf\{I(\eta):\ \dot\gamma_t=\bar B_\eta(\gamma_t)\}\,dt$

where $I(\eta)$ is given by (2.11) or $I(\eta)=I_{cont}(\eta)$ given by (2.25) in the discrete or
continuous time cases, respectively. If $\gamma_t$, $t\in[0,T]$ is not absolutely continuous we
set $S_{0T}(\gamma)=\infty$.

2.5. Theorem. Let $k=1$ in (2.3) and (2.4) or in (2.17) and (2.18) and set
$\Psi^\varepsilon(t)=\Xi^\varepsilon([t/\varepsilon])$ or $\Psi^\varepsilon(t)=\Xi^\varepsilon(t/\varepsilon)$ in the discrete or continuous time cases,
respectively. Then for any continuous function $W_t(x_1,\ldots,x_\ell)$ on $\mathbb{R}_+\times M^\ell$,

(2.33)  $\lim_{\varepsilon\to 0}\varepsilon\ln E_x\exp\Big(\varepsilon^{-1}\int_0^T W_t(X(q_1(t/\varepsilon)),\ldots,X(q_\ell(t/\varepsilon)))\,dt\Big)=\int_0^T r_{cont}(W_t)\,dt$

where $r_{cont}$ is the same as in Theorem 2.3 with $W_t$ considered as a function on $M^\ell$
and in the discrete time case we either extend $q_j(t)=q_j([t])$ to all $t\ge 0$ in order to
write the integral in the exponent in (2.33) or replace this integral by the corresponding
sum.

Furthermore, for any $a,\delta,\lambda>0$ and every continuous $\gamma_t$, $t\in[0,T]$, $\gamma_0=x$ there
exists $\varepsilon_0>0$ such that for all positive $\varepsilon<\varepsilon_0$,

(2.34)  $P\{\rho_{0,T}(\Psi^\varepsilon,\gamma)<\delta\}\ge\exp\{-\frac{1}{\varepsilon}(S_{0T}(\gamma)+\lambda)\}$ and

(2.35)  $P\{\rho_{0,T}(\Psi^\varepsilon,\Phi^a_{0,T}(x))\ge\delta\}\le\exp\{-\frac{1}{\varepsilon}(a-\lambda)\}$

where $\Psi^\varepsilon(0)=x$, $\rho_{0,T}$ is the uniform distance and $\Phi^a_{0,T}(x)=\{\gamma:\ \gamma_0=
x,\ S_{0T}(\gamma)\le a\}$.
2.6. Remark. Suppose that the averaged motion $\bar\Xi^\varepsilon$ has several attracting fixed
points and limit cycles. Then similarly to [12] (Markov chains case) and [17]
(dynamical systems case) we can study rare transitions of the slow motion $\Xi^\varepsilon$ between
these attractors. However, in the nonconventional setup the situation is more
complicated and this problem will not be dealt with in this paper.

Certain versions of Theorems 2.2-2.5 can be obtained for some classes of
dynamical systems such as mixing subshifts of finite type and $C^2$ hyperbolic and
expanding transformations, but in order not to interrupt the probabilistic exposition
here we discuss some of these results in the last Section 5.

In the next section we will show that the study of large deviations in our
nonconventional setup can always be reduced to the case $k=\ell$, i.e. we have to
deal only with $q_j(n)=jn$, $j=1,\ldots,k$. So we discuss next this situation
allowing any $k\ge 1$ while assuming that $X(n), n\ge 0$, $q_j$ and $F$ are the same
as in Theorem 2.1. It turns out that the treatment of the general case when
$X(0),X(1),X(2),\ldots$ is a Markov chain requires a quite complicated and technical
proof whose exposition here would make this paper too long, and so it will be
discussed in another paper. Thus, we will restrict ourselves here to the particular
case when $X(n), n\ge 0$ are independent identically distributed (i.i.d.) random
variables (or vectors). Namely, we are interested in large deviations estimates for
$S_N(F)=\sum_{n=1}^{N}F(X(n),X(2n),\ldots,X(kn))$ where $X(n)\in M$, $n\ge 1$ are i.i.d. random
variables (vectors) with a compact support $M$. Let $r_1,\ldots,r_m\ge 2$ be all primes
not exceeding $k$. Set $A_n=\{a\le n:\ a$ is relatively prime with $r_1,\ldots,r_m\}$ and
$B_n(a)=\{b\le n:\ b=ar_1^{d_1}r_2^{d_2}\cdots r_m^{d_m}$ for some nonnegative integers $d_1,\ldots,d_m\}$.
Now for any bounded measurable function $V$ on $M^k$ we write

(2.36)  $S_N(V)=\sum_{a\in A_N}S_{N,a}(V)$ with $S_{N,a}(V)=\sum_{b\in B_N(a)}V(X(b),X(2b),\ldots,X(kb))$.

Observe that $S_{N,a}(V)$, $a\in A_N$ is a collection of independent random variables.
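
The decomposition in (2.36) rests on an elementary arithmetic fact: removing from $b$ all prime factors not exceeding $k$ leaves a unique $a$ relatively prime with those primes, and the indices $b,2b,\ldots,kb$ used by different classes never overlap. A small sketch, with hypothetical helper names and no claim to be the authors' construction, makes this checkable:

```python
def primes_up_to(k):
    """All primes not exceeding k (trial division; fine for small k)."""
    return [p for p in range(2, k + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def decompose(N, k):
    """Map each a in A_N to the sorted list B_N(a) = {b <= N : b = a * r_1^d_1 ... r_m^d_m}."""
    primes = primes_up_to(k)
    classes = {}
    for b in range(1, N + 1):
        a = b
        for p in primes:          # strip every prime factor <= k from b
            while a % p == 0:
                a //= p
        classes.setdefault(a, []).append(b)
    return classes


def index_set(bs, k):
    """Indices j*b, j = 1..k, b in B_N(a), entering the sum S_{N,a}(V)."""
    return {j * b for b in bs for j in range(1, k + 1)}


if __name__ == "__main__":
    classes = decompose(50, 3)
    print(sorted(classes))        # the set A_50 for k = 3
    print(classes[1])             # B_50(1): numbers of the form 2^d1 * 3^d2
    sets = [index_set(bs, 3) for bs in classes.values()]
    total = sum(len(s) for s in sets)
    print(total == len(set().union(*sets)))  # True when index sets are disjoint
```

Disjointness of the index sets is exactly what makes the sums $S_{N,a}(V)$ independent when the $X(n)$ are i.i.d.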

2.7. Theorem. For any continuous function $V$ on $M^k$ the limit

(2.37)  $Q(V)=\lim_{N\to\infty}\frac{1}{N}\ln E\exp\Big(\sum_{n=1}^{N}V(X(n),X(2n),\ldots,X(kn))\Big)
=\lim_{N\to\infty}\sum_{a\in A_N}\frac{1}{N}\ln E\exp S_{N,a}(V)$

exists and the functional $Q(V)$ is convex and lower semicontinuous. If $V=V_\lambda$
depends on a parameter $\lambda$ and has a bounded derivative in $\lambda$ then $Q(V_\lambda)$ is also
differentiable in $\lambda$. Thus taking $V_\lambda=\lambda F$ we obtain that also for $k\ge 2$ in the
above i.i.d. setup both the upper and the lower large deviations bounds (2.9) and (2.10)
hold true with the rate functional $J$ being the Fenchel-Legendre transform $J(u)=
\sup_\lambda(\lambda u-Q(\lambda F))$ of $Q$.

In Section 4 we will provide a rather explicit computation of the limit (2.37). As
a model application of Theorem 2.7 we can consider the digits $X(n)=X(n,\omega)$, $n\ge 1$
of base $M$ expansions $\omega=\sum_{n=1}^{\infty}X(n,\omega)M^{-n}$, $X(n,\omega)\in\{0,1,\ldots,M-1\}$ of numbers
$\omega\in[0,1)$ which are i.i.d. random variables on the probability space $([0,1),\mathcal{B},P)$
where $\mathcal{B}$ is the Borel $\sigma$-algebra and $P$ is the Lebesgue measure. Take, for instance,
$V(x_1,\ldots,x_k)=\delta_{a_1x_1}\delta_{a_2x_2}\cdots\delta_{a_kx_k}$ for some $a_1,\ldots,a_k\in\{0,1,\ldots,M-1\}$ with
$\delta_{ij}=1$ if $i=j$ and $=0$ otherwise. Then Theorem 2.7 provides large deviations estimates



Nonconventional large deviations






for the number

(2.38)  $n_{a_1,\ldots,a_k}(N,\omega)=\#\{n\le N:\ X(n,\omega)=a_1,\ X(2n,\omega)=a_2,\ \ldots,\ X(kn,\omega)=a_k\}
=\sum_{n=1}^{N}V(X(n,\omega),\ldots,X(kn,\omega))$.

The same setup can be reformulated in the following way. Consider infinite
sequences of letters (colors, spins, etc.) taken out of an alphabet of size $M$. Let
$n_{a_1,\ldots,a_k}(N)$ be the number of arithmetic progressions of length $k$ with both the
first term and the difference equal to $n\le N$ and having the letter (color, spin, etc.)
$a_i$ on the place $i=1,2,\ldots,k$. Then Theorem 2.7 yields large deviations bounds for
$n_{a_1,\ldots,a_k}(N)$ as $N\to\infty$ considered as a random variable on the space of sequences
of letters with any product probability measure, in particular, with the uniform
probability measure which assigns the same weight to each combination of $n$ consecutive
letters (i.e. to each cylinder set of length $n$) for all $n=1,2,\ldots$. We observe that
another statistical physics interpretation of a particular case of the above i.i.d. setup
appeared independently in a recent paper [6] though large deviations bounds were
obtained there only for the case $k=M=2$.
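
The counting variable in (2.38) is straightforward to evaluate on a concrete sequence. The following sketch uses a hypothetical function name and a 0-based Python list to hold $X(1),X(2),\ldots$; the random sequence in the demo is an assumption made for illustration.

```python
import random


def count_pattern(seq, pattern, N):
    """Number of n <= N with X(n) = a_1, X(2n) = a_2, ..., X(kn) = a_k,
    where X(j) is stored as seq[j - 1] and pattern = (a_1, ..., a_k).
    Requires len(seq) >= k * N."""
    k = len(pattern)
    return sum(
        all(seq[j * n - 1] == pattern[j - 1] for j in range(1, k + 1))
        for n in range(1, N + 1)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    M, k, N = 2, 2, 10000
    seq = [rng.randrange(M) for _ in range(k * N)]
    # Under the uniform product measure the typical frequency is M**(-k).
    print(count_pattern(seq, (1, 0), N) / N, M ** (-k))
```

The theorem concerns how unlikely it is for the printed frequency to stay away from its typical value as $N$ grows.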

3. Large deviations for Markov processes: k = 1 case 

3.1. Reduction to the $k=\ell$ case. First, we will show that the study of the limit
(2.6) for any $k\le\ell$ can be reduced to the case $k=\ell$. In order to apply this result
not only to Markov chains but also to other fast mixing processes, in particular
to the dynamical systems considered in Section 5, we will deal here with a somewhat
more general setup.

Let $\{X(n), n=0,1,\ldots\}$ be a sequence of measurable mappings of a measurable
space $(\Omega,\mathcal{F})$ to a Polish space $M$ considered with its Borel $\sigma$-algebra $\mathcal{B}$. Since
$(M,\mathcal{B})$ is isomorphic to a Borel subset $\Gamma$ of an interval we can and do identify $M$
with $\Gamma$ and assume that each $X(n)$ is real (or vector) valued. Then $\{X(n), n=
0,1,\ldots\}$ becomes a real (or vector) valued stochastic process under each probability
measure on $(\Omega,\mathcal{F})$. Our setup includes two such measures $P$ and $\Pi$ while we assume
that $X(n)\Pi=\mu$ does not depend on $n$, i.e. that the one dimensional distribution
$\mu$ of $X(n)$ on the probability space $(\Omega,\mathcal{F},\Pi)$ is the same for all $n$. In order to state
our conditions we introduce also a family of $\sigma$-algebras $\mathcal{F}_{m,l}\subset\mathcal{F}$, $-\infty\le m\le l\le\infty$
satisfying $\mathcal{F}_{-\infty,\infty}=\mathcal{F}$ and $\mathcal{F}_{m,l}\subset\mathcal{F}_{m',l'}$ if $m'\le m$ and $l'\ge l$. Next, we define a
modified $\psi$-mixing (dependence) coefficient by

$\psi(n)=\psi_{P,\Pi}(n)=\sup_{l\ge 0,\,g}\big\{\|E_P(g|\mathcal{F}_{-\infty,l})-E_\Pi g\|_\infty:\ g$ is $\mathcal{F}_{l+n,\infty}$-measurable and $E_\Pi|g|\le 1\big\}$

where $E_Q$ is the expectation with respect to a probability measure $Q$ and $\|\cdot\|_\infty$ is
the $L^\infty(\Omega,P)$ norm. The rationale behind the introduction of two probability measures
$P$ and $\Pi$ above is to allow $X(n), n\ge 0$ to be a Markov chain with an arbitrary
initial distribution (in particular, starting at a point) under $P$ while $X(n)$ is
stationary under $\Pi$ and the distribution of $X(n)$ under $P$ converges to $\mu=X(0)\Pi$.
Furthermore, we will not assume measurability of the $X(n)$'s with respect to some of
the $\sigma$-algebras $\mathcal{F}_{m,l}$ but instead will rely on approximation coefficients defined for each
bounded continuous function $V=V(x_1,\ldots,x_\ell)$ on $M^\ell$ by

$\beta_V(n)=\max_{1\le j\le\ell}\sup_{x_1,\ldots,x_{j-1},x_{j+1},\ldots,x_\ell\in M}\sup_{m\ge 0}\|V(x_1,\ldots,x_{j-1},X(m),x_{j+1},\ldots,x_\ell)
-V(x_1,\ldots,x_{j-1},E_P(X(m)|\mathcal{F}_{m-n,m+n}),x_{j+1},\ldots,x_\ell)\|_\infty$.

Since $V$ is continuous we can take here the supremum over a countable dense set
in $M^{\ell-1}$, and so outside of one $P$-measure zero set $\beta_V(n)$ gives a uniform bound of
the difference above.

3.1. Proposition. Let $V(x_1,\ldots,x_\ell)$ be a bounded continuous function on $M^\ell$ and
assume that

(3.1)  $\lim_{n\to\infty}(\psi(n)+\beta_V(n))=0$

together with the conditions (2.3) and (2.4) on the functions $q_j$, $j=1,\ldots,\ell$. Then,

(3.2)  $\lim_{N\to\infty}\frac{1}{N}\Big(\ln E_P\exp\Big(\sum_{n=1}^{N}V(X(q_1(n)),\ldots,X(q_\ell(n)))\Big)
-\ln E_P\exp\Big(\sum_{n=1}^{N}V^{(k)}(X(n),X(2n),\ldots,X(kn))\Big)\Big)=0$

where for each $m<\ell$,

(3.3)  $V^{(m)}(x_1,\ldots,x_m)=\ln\int_M\cdots\int_M\exp(V(x_1,\ldots,x_m,x_{m+1},\ldots,x_\ell))\,d\mu(x_{m+1})\cdots d\mu(x_\ell)$

and $V^{(\ell)}=V$.

If, in fact, $X(n)$ is $\mathcal{F}_{n,n}$-measurable then (3.2) holds true for any bounded measurable
function $V$ assuming only that $\psi(n)\to 0$ as $n\to\infty$.
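
The functions $V^{(m)}$ from (3.3) satisfy a one-step consistency relation, $V^{(m-1)}(x_1,\ldots,x_{m-1})=\ln\int\exp(V^{(m)}(x_1,\ldots,x_{m-1},y))\,d\mu(y)$, which is what allows coordinates to be integrated out one at a time. The sketch below checks this on a finite state space; the measure, the function $V$ and the helper name `reduced` are assumptions chosen for illustration.

```python
import itertools
import math


def reduced(V, mu, ell, m):
    """Tabulate V^(m) on tuples (x_1, ..., x_m) for the finite state space
    M = {0, ..., len(mu) - 1}, integrating exp(V) over the last ell - m
    coordinates with respect to the product of mu."""
    states = range(len(mu))
    table = {}
    for head in itertools.product(states, repeat=m):
        total = 0.0
        for tail in itertools.product(states, repeat=ell - m):
            weight = 1.0
            for y in tail:
                weight *= mu[y]
            total += math.exp(V(*head, *tail)) * weight
        table[head] = math.log(total)
    return table


if __name__ == "__main__":
    mu = [0.3, 0.7]                              # hypothetical measure on {0, 1}
    V = lambda x, y, z: x + 2 * y * z - z        # hypothetical V with ell = 3
    V2 = reduced(V, mu, 3, 2)
    V1 = reduced(V, mu, 3, 1)
    for x in range(2):
        one_step = math.log(sum(math.exp(V2[(x, y)]) * mu[y] for y in range(2)))
        print(x, V1[(x,)], one_step)             # the two columns should agree
```

For $m=\ell$ the table reproduces $V$ itself, in agreement with the convention $V^{(\ell)}=V$.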

Proof. Observe that (2.4) yields

(3.4)  $\lim_{n\to\infty}(q_j(\gamma n)-q_{j-1}(n))=\infty$ for any $j>k$ and $\gamma>0$.

Set

$d_\gamma(n)=\min_{k+1\le j\le\ell}\min\Big(q_j(\gamma n)-q_{j-1}(n),\ \min_{l\ge\gamma n}(q_j(l)-q_j(l-1))\Big)$

and observe that $d_\gamma(n)\to\infty$ as $n\to\infty$ in view of (2.4) and (3.4). For any
$l=0,1,\ldots$ and $0\le r\le\infty$ set

$X_r(l)=E_P(X(l)|\mathcal{F}_{l-r,l+r})$.

Next, for $m=1,2,\ldots,\ell$, $a\le b\le c$ and $0\le r\le\infty$ denote

$Z_r^{(m)}(a,b,c)=E_P\exp\Big(\sum_{a\le l\le b}V^{(m)}(X_r(q_1(l)),\ldots,X_r(q_m(l)))
+\sum_{b<l\le c}V^{(m-1)}(X_r(q_1(l)),\ldots,X_r(q_{m-1}(l)))\Big)$.

If $b=c$, i.e. we have only the first sum above, we set $Z_r^{(m)}(a,b,c)=Z_r^{(m)}(a,b)$. If
$r=\infty$ we drop the index $r$ and write just $Z^{(m)}(a,b,c)$ or $Z^{(m)}(a,b)$. Observe that

(3.5)  $e^{-C(V)\gamma N}Z^{(m)}(\gamma N,N)\le Z^{(m)}(0,N)\le e^{C(V)\gamma N}Z^{(m)}(\gamma N,N)$

where $C(V)=\sup_{x_1,\ldots,x_\ell\in M}|V(x_1,\ldots,x_\ell)|$. By the definition of $\beta_V(n)$ (and the
remark after it) we obtain also that for any $m=1,2,\ldots,\ell$, $a\le b\le c$ and $0\le r\le\infty$,

(3.6)  $Z_r^{(m)}(a,b,c)e^{-(c-a)\ell\beta_V(r)}\le Z^{(m)}(a,b,c)\le Z_r^{(m)}(a,b,c)e^{(c-a)\ell\beta_V(r)}$.

Let $g=g(x,y)$ be a bounded measurable function on a product $M_1\times M_2$ (for
some measurable spaces $(M_1,\mathcal{B}_1)$ and $(M_2,\mathcal{B}_2)$) and $X:\Omega\to M_1$ and $Y:\Omega\to M_2$
be $\mathcal{F}_{-\infty,l}$- and $\mathcal{F}_{l+n,\infty}$-measurable random variables (maps), respectively. Then
it follows from the definition of $\psi(n)=\psi_{P,\Pi}(n)$ that

(3.7)  $|E_P(g(X,Y)|\mathcal{F}_{-\infty,l})-g_\Pi(X)|\le\psi(n)|g|_\Pi(X)$

where $g_\Pi(x)=E_\Pi g(x,Y)$ and $|g|_\Pi(x)=E_\Pi|g(x,Y)|$. Now take $r=r_\gamma(N)=
[\frac{1}{3}d_\gamma(N)]$ where $[\cdot]$ denotes the integral part. Then for all $N\ge n\ge\gamma N+1$,
$m=k+1,\ldots,\ell$ and $N$ large enough

(3.8)  $Z_r^{(m)}(\gamma N,n,N)=E_P\Big(J\exp\Big(\sum_{\gamma N\le l\le n-1}V^{(m)}(X_r(q_1(l)),\ldots,X_r(q_m(l)))
+\sum_{n<l\le N}V^{(m-1)}(X_r(q_1(l)),\ldots,X_r(q_{m-1}(l)))\Big)\Big)$

where

$J=J_r(n)=E_P\big(\exp(V^{(m)}(X_r(q_1(n)),\ldots,X_r(q_m(n))))\,\big|\,\mathcal{F}_{-\infty,q_m(n-1)+r}\big)$.

By (3.7) and the definition of $\beta_V$ we conclude that

(3.9)  $\Big|J-\int\exp(V^{(m)}(X_r(q_1(n)),\ldots,X_r(q_{m-1}(n)),y))\,d\mu(y)\Big|
\le\eta(r)\int\exp(V^{(m)}(X_r(q_1(n)),\ldots,X_r(q_{m-1}(n)),y))\,d\mu(y)$

where $\eta(n)=(\psi(n)+2\beta_V(n)+2\beta_V(n)\psi(n))e^{C(V)}\to 0$ as $n\to\infty$. Employing (3.8)
and (3.9) for $n=N,N-1,\ldots,[\gamma N]+1$ we obtain that

(3.10)  $(1-\eta(r))^N Z_r^{(m-1)}(\gamma N,N)\le Z_r^{(m)}(\gamma N,N)\le(1+\eta(r))^N Z_r^{(m-1)}(\gamma N,N)$.

Next, we use (3.10) for $m=\ell,\ell-1,\ldots,k+1$ which together with (3.5) and (3.6)
yields that

(3.11)  $(1-\eta(r))^{\ell N}e^{-2N(\ell\beta_V(r)+C(V)\gamma)}Z^{(k)}(0,N)\le Z^{(\ell)}(0,N)
\le(1+\eta(r))^{\ell N}e^{2N(\ell\beta_V(r)+C(V)\gamma)}Z^{(k)}(0,N)$.

Taking $\ln$ in (3.11), dividing by $N$, letting $N\to\infty$ and taking into account that
then $r=r(N)\to\infty$, we obtain that

$\limsup_{N\to\infty}\frac{1}{N}\big|\ln Z^{(\ell)}(0,N)-\ln Z^{(k)}(0,N)\big|\le 2C(V)\gamma$

and (3.2) follows since $\gamma>0$ is arbitrary.

If $X(n)$ is $\mathcal{F}_{n,n}$-measurable for each $n$ then we do not have to deal with the
approximation coefficient $\beta_V(r)$ and $X_r=X$, $Z_r^{(m)}=Z^{(m)}$ above. Hence all the above
arguments remain true with $\beta_V(r)=0$ for any bounded measurable $V$ and we
obtain (3.2) provided $\psi(n)\to 0$ as $n\to\infty$. □

It is easy to check the conditions of Proposition 3.1 for Markov chains $X(n), n\ge 0$
satisfying the "strong" Doeblin condition (2.1). Indeed, denote by $\mathcal{F}_{l,m}$, $l\le m$
the $\sigma$-algebra generated by $X(l),\ldots,X(m)$ with $\mathcal{F}_{l,\infty}$ being the minimal $\sigma$-algebra
containing all $\mathcal{F}_{l,m}$, $m\ge l$ and we set $\mathcal{F}_{l,m}=\mathcal{F}_{0,m}$ for $l<0$ and $m\ge 0$. If $g$ is
$\mathcal{F}_{l+n,\infty}$-measurable then by the Markov property

(3.12)  $E_P(g|\mathcal{F}_{-\infty,l})=\int P(n,X(l),dy)E_{P_y}g$

where $P_y$ is the probability measure on the path space of the Markov chain $X(n)$
starting at $y$. The Chapman-Kolmogorov equation says that for any $n\ge n_0$,

$P(n,x,G)=\int P(n-n_0,x,dy)P(n_0,y,G)$,

and so by (2.1) for all such $n$,

$C^{-1}\nu(G)\le P(n,x,G)\le C\nu(G)$.






Yu. Kifer and S.R.S. Varadhan



This together with the Radon-Nikodym theorem yields existence for $\nu$-almost all $y$
and $n\ge n_0$ of the transition density $p(n,x,y)$ satisfying

$C^{-1}\le p(n,x,y)=\frac{dP(n,x,\cdot)}{d\nu}(y)\le C$.

It is well known (see, for instance, [5]) that (2.1) and (2.2) imply that

(3.13)  $(1-Ke^{-\kappa n})p(y)\le p(n,x,y)\le(1+Ke^{-\kappa n})p(y)$

for some $K,\kappa>0$ independent of $n\ge n_0$. If $\Pi$ is the stationary probability of the
Markov chain on the path space then

$E_\Pi g=\int p(y)E_{P_y}g\,d\nu(y)$.

Hence, by (3.12) and (3.13),

$\|E_P(g|\mathcal{F}_{-\infty,l})-E_\Pi g\|_\infty\le Ke^{-\kappa n}E_\Pi|g|$.

Thus the condition (3.1) with $\beta_V(n)=0$ is satisfied in our Markov chains case.
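
The exponential bound (3.13) can be seen explicitly on a two-state chain, where the $n$-step transition probabilities approach the invariant distribution at a geometric rate. The sketch below is only an illustration: the transition matrix is a hypothetical choice and the reference measure $\nu$ is taken to be counting measure, so that $p(n,x,y)/p(y)$ becomes $P^n(x,y)/\pi(y)$.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def max_relative_deviation(P, pi, n):
    """max over x, y of |P^n(x, y) / pi(y) - 1| for n >= 1."""
    Pn = P
    for _ in range(n - 1):
        Pn = mat_mul(Pn, P)
    m = len(P)
    return max(abs(Pn[x][y] / pi[y] - 1.0) for x in range(m) for y in range(m))


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical chain; second eigenvalue 0.7
    pi = [2.0 / 3.0, 1.0 / 3.0]    # its invariant distribution
    for n in range(1, 8):
        print(n, max_relative_deviation(P, pi, n))
```

For this chain the printed deviation equals $2\cdot 0.7^n$, so it plays the role of $Ke^{-\kappa n}$ with $K=2$ and $\kappa=-\ln 0.7$.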

3.2. Corollary. Assume that the conditions of Proposition 3.1 hold true. Suppose that
for any bounded measurable function $V_\lambda(x_1,\ldots,x_k)$ on $\mathbb{R}\times M^k$ having a bounded in
$x_1,\ldots,x_k$ derivative in a parameter $\lambda\in(-\infty,\infty)$ the limit

$Q(V_\lambda)=\lim_{N\to\infty}\frac{1}{N}\ln E_x\exp\Big(\sum_{n=1}^{N}V_\lambda(X(n),X(2n),\ldots,X(kn))\Big)$

exists, it is a lower semicontinuous convex functional and it is differentiable in the
parameter $\lambda$. Then for any bounded measurable function $W_\lambda(x_1,\ldots,x_\ell)$ on $\mathbb{R}\times M^\ell$
having a bounded in $x_1,\ldots,x_\ell$ derivative in a parameter $\lambda\in(-\infty,\infty)$ the limit

$Q(W_\lambda)=\lim_{N\to\infty}\frac{1}{N}\ln E_x\exp\Big(\sum_{n=1}^{N}W_\lambda(X(q_1(n)),\ldots,X(q_\ell(n)))\Big)=Q(W_\lambda^{(k)})$

exists, it is a lower semicontinuous convex functional and it is differentiable in the
parameter $\lambda$. In particular, the large deviations estimates in the form (2.9) and
(2.10) then hold true with the rate functional $J$ given by (2.8) with $W_\lambda=\lambda F$.

Proof. By Proposition 3.1, $Q(W_\lambda)=Q(W_\lambda^{(k)})$ and we see from (3.3) that if $W_\lambda$
is bounded and has a bounded derivative in $\lambda$ then so does $W_\lambda^{(k)}$. Hence, by the
assumption $Q(W_\lambda^{(k)})$ is a lower semicontinuous convex functional and it is
differentiable in $\lambda$ which implies the same for $Q(W_\lambda)$ and the result follows. □

Now let $k=1$ and $V=W_\lambda$ as in Theorem 2.1. Then $W_\lambda^{(1)}=\ln\hat W_\lambda$ and by
Proposition 3.1,

(3.14)  $Q(W_\lambda)=\lim_{N\to\infty}\frac{1}{N}\ln E_x\exp\Big(\sum_{n=1}^{N}W_\lambda^{(1)}(X(n))\Big)$.

Thus we arrive at the standard limit appearing in "conventional" large deviations
results which is well known for Markov chains $X(n), n\ge 0$ satisfying our
conditions as it is described in Theorem 2.1. Differentiability of $Q(W_\lambda)$ in $\lambda$ follows
from standard results on positive operators (see, for instance, [5D]) and we derive
now Theorem 2.1 from well known "conventional" large deviations results (see, for
instance, [S], [TB] and Section 2.3 in dJ). □ 
3.2. 2nd level of large deviations. Recall that in the setup of Theorem 2.2 we
have $k=1$, $M$ being a compact space and the result is about large deviations for
the occupational measures $\zeta_N$ appearing there. Let $W$ be a continuous function on $M^\ell$
with $\hat W$ defined by (2.5). By Proposition 3.1 together with the well known facts
(see, for instance, [5], [TT] and [IS]),

(3.15)  $Q(W)=\lim_{N\to\infty}\frac{1}{N}\ln E_x\exp\Big(\sum_{n=1}^{N}W(X(q_1(n)),\ldots,X(q_\ell(n)))\Big)=\ln(r(W))$

where $r(W)$ is the spectral radius of the operator

(3.16)  $R(W)g(x)=E_x\big(g(X(1))\hat W(X(1))\big)=E_x\big(g(X(1))e^{\ln\hat W(X(1))}\big)$.

Observe that by the Donsker-Varadhan variational formula (see [9] and [10]),

(3.17)  $Q(W)=\sup_{\nu\in\mathcal{P}(M)}\Big(\int_M\ln\hat W(x)\,d\nu(x)-I(\nu)\Big)$

where $I(\nu)=-\inf_{u\in C_+(M)}\int\ln\frac{E_x u(X(1))}{u(x)}\,d\nu(x)$ and the infimum is taken over
positive continuous functions on $M$.

Next, let $Y^{(i)}(n)$, $i=2,\ldots,\ell$; $n=0,1,2,\ldots$ be i.i.d. $M$-valued random variables
with the distribution $\mu$, all of them independent of the Markov chain $X(n), n\ge 0$.
Then it is easy to see that

(3.18)  $\lim_{N\to\infty}\frac{1}{N}\ln E_x\exp\Big(\sum_{n=1}^{N}W(X(n),Y^{(2)}(n),\ldots,Y^{(\ell)}(n))\Big)=Q(W)$.

Indeed, let $\mathcal{F}_X$ be the $\sigma$-algebra generated by the Markov chain $X(n), n\ge 0$. Then

(3.19)  $E_x\exp\Big(\sum_{n=1}^{N}W(X(n),Y^{(2)}(n),\ldots,Y^{(\ell)}(n))\Big)
=E_x\Big(E_x\Big(\exp\Big(\sum_{n=1}^{N}W(X(n),Y^{(2)}(n),\ldots,Y^{(\ell)}(n))\Big)\Big|\mathcal{F}_X\Big)\Big)
=E_x\exp\Big(\sum_{n=1}^{N}\ln\hat W(X(n))\Big)$

and (3.18) follows. But now we have the standard situation for the Markov chain
$(X(n),Y^{(2)}(n),\ldots,Y^{(\ell)}(n))$, $n\ge 0$, and so by the Donsker-Varadhan variational
formula (see [9] and [10]),

(3.20)  $Q(W)=\sup_{\nu\in\mathcal{P}(M^\ell)}\Big(\int W(x_1,x_2,\ldots,x_\ell)\,d\nu(x_1,\ldots,x_\ell)-I(\nu)\Big)$

where

(3.21)  $I(\nu)=-\inf_{u\in C_+(M\times\cdots\times M)}\int_{M\times\cdots\times M}\ln\frac{E_{x_1}\int u(X(1),y_2,\ldots,y_\ell)\,d\mu(y_2)\cdots d\mu(y_\ell)}{u(x_1,\ldots,x_\ell)}\,d\nu(x_1,\ldots,x_\ell)$.

It is known here (see, for instance, Proposition 5.1 in [15]) that there exists a
unique $\nu=\nu_W$ on which the supremum in (3.20) is attained and it follows from
the standard theory (see, for instance, [16]) that $I(\nu)$ is the rate functional for the
second level large deviations both for the auxiliary occupational measures

$\frac{1}{N}\sum_{n=1}^{N}\delta_{(X(n),Y^{(2)}(n),\ldots,Y^{(\ell)}(n))}$

and for our nonconventional occupational measures $\zeta_N$. □
3.3. Continuous time case. Similarly to the discrete time case, the main step in
the proof of Theorem 2.3 is to establish (2.19) and to identify the limit there as the
spectral radius of the semigroup (2.20).

From (2.16) it follows that for any $t\ge t_0$ and every measurable set $G\subset M$,

(3.22) $P(t,x,G)=\int_G p(t,x,y)\,d\nu(y)$ with $C^{-1}\le p(t,x,y)\le C$.



Furthermore, similarly to (3.5) (see [8]),

(3.23) $(1-Ke^{-\kappa t})p(y)\le p(t,x,y)\le(1+Ke^{-\kappa t})p(y)$

where $p(y)=\frac{d\mu}{d\nu}(y)$ is the density of the unique invariant measure $\mu$ of the Markov
process $X$. Observe that (2.18) implies also that for any $j\ge k+1$ and $\gamma>0$,

(3.24) $\lim_{t\to\infty}\big(q_j(\gamma t)-q_{j-1}(t)\big)=\infty.$

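Let $V=V(x_1,\dots,x_\ell)$ be a bounded measurable function on $M^\ell$ and for $m=1,2,\dots,\ell$ set

$V^{(m)}_{cont}(x_1,\dots,x_m)=\int\cdots\int V(x_1,\dots,x_m,x_{m+1},\dots,x_\ell)\,d\mu(x_{m+1})\cdots d\mu(x_\ell)$

with $V^{(\ell)}_{cont}=V$. Set $t_n(\gamma,T)=\gamma T+n(\gamma+\gamma^2)$ for $n=0,1,2,\dots,M(\gamma,T)-1$ where
$M(\gamma,T)=[T(1-\gamma)/(\gamma+\gamma^2)]$. Next, for $a\le b\le c$ and $m=1,2,\dots,\ell$ we denote

$Z_x^{(m)}(a,b,c)=E_x\exp\Big(\sum_{a\le n\le b}\int_{t_n(\gamma,T)}^{t_n(\gamma,T)+\gamma}V^{(m)}_{cont}\big(X(q_1(t)),\dots,X(q_m(t))\big)\,dt$

$\qquad+\sum_{b<n<c}\int_{t_n(\gamma,T)}^{t_n(\gamma,T)+\gamma}V^{(m-1)}_{cont}\big(X(q_1(t)),\dots,X(q_{m-1}(t))\big)\,dt\Big)$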

and set $Z_x^{(m)}(a,b)=Z_x^{(m)}(a,b,b)$. Observe that $Z_x^{(\ell)}(0,M(\gamma,T))$ does not contain
the integration from $0$ to $\gamma T$ as well as the sum of integrals from $t_n(\gamma,T)+\gamma$
to $t_n(\gamma,T)+\gamma+\gamma^2$ which are both present in the integral from $0$ to $T$, and so
estimating these missing parts we arrive at the inequality

(3.25) $\exp(-2C(V)\gamma T)Z_x^{(\ell)}(0,M(\gamma,T))\le E_x\exp\Big(\int_0^TV\big(X(q_1(t)),\dots,X(q_\ell(t))\big)\,dt\Big)\le\exp(2C(V)\gamma T)Z_x^{(\ell)}(0,M(\gamma,T))$

where $C(V)=\sup_{(x_1,\dots,x_\ell)}|V(x_1,\dots,x_\ell)|$.
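The factor $2C(V)\gamma T$ in (3.25) comes from the total length of the parts of $[0,T]$ not covered by the blocks $[t_n(\gamma,T),t_n(\gamma,T)+\gamma]$. The following sketch, with hypothetical values of $\gamma$ and $T$, builds the blocks for $n=0,\dots,M(\gamma,T)-1$ and measures the omitted length. A direct computation suggests that the omitted length is at most $2\gamma T$ once $T\ge(1+\gamma)/(2\gamma)$; this threshold is our own estimate and is not stated in the text.

```python
import math

def blocks(gamma, T):
    """Intervals [t_n, t_n + gamma] with t_n = gamma*T + n*(gamma + gamma^2), n < M."""
    M = math.floor(T * (1.0 - gamma) / (gamma + gamma ** 2))
    step = gamma + gamma ** 2
    return [(gamma * T + n * step, gamma * T + n * step + gamma) for n in range(M)]

def omitted_length(gamma, T):
    """Length of the part of [0, T] not covered by the blocks."""
    return T - sum(end - start for start, end in blocks(gamma, T))

gamma, T = 0.1, 1000.0            # hypothetical parameters
bl = blocks(gamma, T)
print(len(bl), omitted_length(gamma, T), 2.0 * gamma * T)
```

Consecutive blocks are separated by gaps of length $\gamma^2$, which is what later makes the conditioning on $\mathcal F_{q_m(t_n(\gamma,T)-\gamma^2)}$ possible.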

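Denote by $\mathcal F_t$ the $\sigma$-algebra generated by $X(s)$, $s\le t$. Then by (2.18)
and (3.24) for all $T$ large enough if $n\ge 1$, $T\ge t\ge t_n(\gamma,T)$ and
$s\le t_{n-1}(\gamma,T)+\gamma$ then $X(q_1(s)),\dots,X(q_m(s))$ and $X(q_1(t)),\dots,X(q_{m-1}(t))$ are
$\mathcal F_{q_m(t_n(\gamma,T)-\gamma^2)}$-measurable. Hence,

(3.26) $Z_x^{(m)}(0,n,M(\gamma,T))=E_x\Big(J_{m,n}\exp\Big(\sum_{0\le j<n}\int_{t_j(\gamma,T)}^{t_j(\gamma,T)+\gamma}V^{(m)}_{cont}\big(X(q_1(s)),\dots,X(q_m(s))\big)\,ds$

$\qquad+\sum_{n+1\le j<M(\gamma,T)}\int_{t_j(\gamma,T)}^{t_j(\gamma,T)+\gamma}V^{(m-1)}_{cont}\big(X(q_1(s)),\dots,X(q_{m-1}(s))\big)\,ds\Big)\Big)$

where

$J_{m,n}=E_x\Big(\exp\Big(\int_{t_n(\gamma,T)}^{t_n(\gamma,T)+\gamma}V^{(m)}_{cont}\big(X(q_1(s)),\dots,X(q_m(s))\big)\,ds\Big)\Big|\mathcal F_{q_m(t_n(\gamma,T)-\gamma^2)}\Big).$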
Let

$\tilde J_{m,n}=\exp\Big(\int_{t_n(\gamma,T)}^{t_n(\gamma,T)+\gamma}E_x\Big(V^{(m)}_{cont}\big(X(q_1(s)),\dots,X(q_m(s))\big)\Big|\mathcal F_{q_m(t_n(\gamma,T)-\gamma^2)}\Big)\,ds\Big).$

Since $|e^a-1-a|\le a^2$ if $|a|\le 1$ then

(3.27) $|J_{m,n}-\tilde J_{m,n}|\le 2\gamma^2(C(V))^2.$
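The elementary inequality behind (3.27) is easy to confirm numerically. The sketch below checks $|e^a-1-a|\le a^2$ on a grid in $[-1,1]$ and then, for a hypothetical symmetric two-point random variable $A=\pm c$, compares $E e^A$ with $e^{EA}$ against the bound $2c^2$. The two-point variable is only a stand-in for the block integral in the text.

```python
import math

def taylor_gap(a):
    """|e^a - 1 - a|, the second order Taylor remainder of the exponential."""
    return abs(math.exp(a) - 1.0 - a)

grid = [i / 1000.0 for i in range(-1000, 1001) if i != 0]
worst_ratio = max(taylor_gap(a) / (a * a) for a in grid)

def jensen_gap(c):
    """|E e^A - e^{E A}| for A = +c or -c with probability 1/2 each (so E A = 0)."""
    return abs(math.cosh(c) - 1.0)

print(worst_ratio, jensen_gap(0.3), 2.0 * 0.3 ** 2)
```

Applying the inequality once to $E e^A$ and once to $e^{EA}$ is what produces the factor 2 in (3.27).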

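On the other hand, by the Markov property

(3.28) $\tilde J_{m,n}=\exp\Big(\int_{t_n(\gamma,T)}^{t_n(\gamma,T)+\gamma}ds\int_Mp\big(q_m(s)-q_m(t_n(\gamma,T)-\gamma^2),X(q_m(t_n(\gamma,T)-\gamma^2)),y\big)V^{(m)}_{cont}\big(X(q_1(s)),\dots,X(q_{m-1}(s)),y\big)\,d\nu(y)\Big).$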

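Set

$d_\gamma(t)=\inf_{s\ge\gamma t}\min_{k+1\le j\le\ell}\min\big(q_j(s)-q_{j-1}(s\gamma^{-1}),\,q_j(s)-q_j(s-\gamma^2)\big)$

and observe that $d_\gamma(t)\to\infty$ as $t\to\infty$ for each fixed $\gamma>0$ in view of the assumption
(2.18). Now, by (3.23) and (3.28),

(3.29) $\exp\big(-Ke^{-\kappa d_\gamma(T)}C(V)\gamma\big)\le\tilde J_{m,n}\exp\Big(-\int_{t_n(\gamma,T)}^{t_n(\gamma,T)+\gamma}V^{(m-1)}_{cont}\big(X(q_1(s)),\dots,X(q_{m-1}(s))\big)\,ds\Big)\le\exp\big(Ke^{-\kappa d_\gamma(T)}C(V)\gamma\big).$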

Employing (3.26)-(3.29) for $n=M(\gamma,T),M(\gamma,T)-1,\dots,1$ with each $m=\ell,\ell-1,\dots,k+1$ we obtain that

(3.30) $\limsup_{\gamma\to 0}\limsup_{T\to\infty}\frac 1T\big|\ln\big(Z_x^{(\ell)}(0,M(\gamma,T))\big)-\ln\big(Z_x^{(k)}(0,M(\gamma,T))\big)\big|=0.$

Now taking $\ln$ in (3.25) and letting first $T\to\infty$ and then $\gamma\to 0$ we obtain from
(3.30) and the definition of $V^{(k)}_{cont}$ that

(3.31) $\lim_{T\to\infty}\frac 1T\Big(\ln E_x\exp\Big(\int_0^TV\big(X(q_1(t)),\dots,X(q_\ell(t))\big)\,dt\Big)-\ln E_x\exp\Big(\int_0^TV^{(k)}_{cont}\big(X(t),\dots,X(kt)\big)\,dt\Big)\Big)=0.$

If $k=1$ then $\frac 1T$ of the second expression in brackets in (3.31) converges as $T\to\infty$
to the logarithm of the spectral radius of the semigroup of operators
defined in (2.20). Thus, the assertions of Theorems 2.3 and 2.4 follow from the well
known results on large deviations (see [S] , [TU] , [H] and [TT] ) in the same way as in
the discrete time case. □

3.4. Nonconventional averaging. According to [12] the large deviations estimates
(2.34) and (2.35) follow once we establish (2.33) for all continuous functions
$W_t(x_1,\dots,x_\ell)$ on $\mathbb R_+\times M^\ell$. First, we claim that even without the assumption $k=1$,

(3.32) $\lim_{\varepsilon\to 0}\varepsilon\Big(\ln E_x\exp\Big(\varepsilon^{-1}\int_0^TW_t\big(X(q_1(t/\varepsilon)),\dots,X(q_\ell(t/\varepsilon))\big)\,dt\Big)$
$\qquad-\ln E_x\exp\Big(\varepsilon^{-1}\int_0^T\hat W_t^{(k)}\big(X(q_1(t/\varepsilon)),\dots,X(q_k(t/\varepsilon))\big)\,dt\Big)\Big)=0$

where in the discrete time case the $q_j$'s are extended to all $s\ge 0$ by writing $q_j(s)=q_j([s])$ and we set

$\hat W_t^{(k)}(x_1,\dots,x_k)=\ln\int_M\cdots\int_M\exp\big(W_t(x_1,\dots,x_\ell)\big)\,d\mu(x_{k+1})\cdots d\mu(x_\ell)$

while in the continuous time case we set

$\hat W_t^{(k)}(x_1,\dots,x_k)=\int_M\cdots\int_MW_t(x_1,\dots,x_\ell)\,d\mu(x_{k+1})\cdots d\mu(x_\ell).$

The proof of (3.32) is the same as the proofs of (3.1) in the discrete time case and
of (3.31) in the continuous time case while the dependence of $W_t$ on $t$ does not play
any role in the arguments employed there.

Next, when $k=1$ we arrive at the "conventional" setup and (2.33) follows in
the same way as in [12] (see also [17]). □

4. Large deviations for any $k\ge 1$: i.i.d. case

Here we assume that $X(n)$, $n\ge 1$ are i.i.d. random variables (vectors) and
rely on the decomposition (2.36). In view of the independence of $S_{N,a}(V)$ for different
$a\in A_N$ we can write

(4.1) $Z_N(V)=E\exp\Big(\sum_{n=1}^NV\big(X(n),X(2n),\dots,X(kn)\big)\Big)=\prod_{a\in A_N}Z_{N,a}(V)$

where

$Z_{N,a}(V)=E\exp\Big(\sum_{b\in B_N(a)}V\big(X(b),X(2b),\dots,X(kb)\big)\Big)$

with $A_N$ and $B_N(a)$ defined in Section 2.

In order to study $Z_{N,a}(V)$ we introduce also

$B(a)=\{b\ge 1:\ b=ar_1^{d_1}r_2^{d_2}\cdots r_m^{d_m}$ for some nonnegative integers $d_1,\dots,d_m\}.$

Observe that each $l=1,2,\dots,k$ can be written uniquely in the form
$l=r_1^{d_1(l)}r_2^{d_2(l)}\cdots r_m^{d_m(l)}$ for some nonnegative integers $d_1(l),\dots,d_m(l)$. Now, if
$b=ar_1^{d_1}\cdots r_m^{d_m}\in B(a)$ and $l=1,2,\dots,k$ then $lb=ar_1^{d_1+d_1(l)}\cdots r_m^{d_m+d_m(l)}\in B(a)$.
Next, consider the lattice $\mathbb Z^m$ and set

$\mathbb Z^m_+=\{n=(n_1,\dots,n_m),\ n_i\ge 0$ for all $i=1,\dots,m\}.$

Then the formula $\varphi_a(n_1,\dots,n_m)=ar_1^{n_1}\cdots r_m^{n_m}$ provides a one-to-one correspondence

$\varphi_a:\ \mathbb Z^m_+\to B(a)$

where, recall, $a$ is relatively prime with $r_1,\dots,r_m$. Set

$D(\rho)=\{n=(n_1,\dots,n_m)\in\mathbb Z^m:\ n_i\ge 0,\ i=1,\dots,m$ and $\sum_{i=1}^mn_i\ln r_i\le\rho\}.$

Then, clearly,

(4.2) $\varphi_aD(\ln(N/a))=B_N(a).$
It follows that

(4.3) $|B_N(a)|\le\prod_{i=1}^m\Big(1+\frac 1{\ln r_i}\ln\frac Na\Big)\le\Big(1+\frac 1{\ln 2}\ln\frac Na\Big)^m$

where $|\Gamma|$ denotes the cardinality of a set $\Gamma$. Hence

(4.4) $a\le N2^{-(|B_N(a)|^{1/m}-1)}.$
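The correspondence (4.2) and the bound (4.3) can be explored on a concrete case. The sketch below assumes the primes $r_1=2$, $r_2=3$ (as for $k=4$) and hypothetical values $N=1000$, $a=7$. It enumerates the exponent vectors with integer arithmetic, forms $B_N(a)$ as their image under $\varphi_a$, and evaluates both bounds in (4.3).

```python
import math

PRIMES = (2, 3)   # hypothetical choice: the primes not exceeding k = 4

def exponents(N, a):
    """All pairs ((n_1,...,n_m), b) with b = a * prod r_i^{n_i} <= N."""
    found = [((), a)]
    for p in PRIMES:
        extended = []
        for exps, b in found:
            d = 0
            while b <= N:
                extended.append((exps + (d,), b))
                b *= p
                d += 1
        found = extended
    return found

def B(N, a):
    """B_N(a) as the image of the exponent vectors under phi_a."""
    return {b for _, b in exponents(N, a)}

def bounds_43(N, a):
    """The two upper bounds for |B_N(a)| in (4.3)."""
    x = math.log(N / a)
    first = math.prod(1.0 + x / math.log(p) for p in PRIMES)
    second = (1.0 + x / math.log(2.0)) ** len(PRIMES)
    return first, second

S = B(1000, 7)
print(len(S), bounds_43(1000, 7))
```

In this example the set $B_N(a)$ is also $a$ times the set $B_{[N/a]}(1)$, which is the scaling used later to show that $Z_{N,a}(V)$ depends only on $|B_N(a)|$.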

Next, we claim that $Z_{N,a}(V)$ is determined only by $|B_N(a)|$ and not by $N$ and
$a$ themselves. Indeed, since $|D(\rho)|$ is nondecreasing in $\rho$ then it determines the
set $D(\rho)$ itself, and so $|D(\ln(N/a))|=|B_N(a)|=|B_{N/a}(1)|$ determines the set
$B_{N/a}(1)$ in view of (4.2). Set $\hat B_\eta(a)=B_\eta(a)\cup\{n:\ n=ln'$ for some $n'\in B_\eta(a)$
and $l=2,3,\dots,k\}$. Then we can write

$Z_{\eta,a}(V)=\int\cdots\int\exp\Big(\sum_{b\in B_\eta(a)}V(x_b,x_{2b},\dots,x_{kb})\Big)\prod_{b'\in\hat B_\eta(a)}d\mu(x_{b'}).$

It is easy to see from here that $Z_{\eta,a}(V)=Z_{\eta/a,1}(V)$ for any $\eta>0$ and an integer
$a\ge 2$ relatively prime with $r_1,\dots,r_m$. Indeed, $Z_{\eta,a}(V)$ is determined by the labeled
directed graph $\Gamma_\eta(a)$ having $\hat B_\eta(a)$ as its vertices and having arrows of $k-1$ types
so that an arrow with a label $l=2,3,\dots,k$ is drawn from $n\in\hat B_\eta(a)$ to $n'\in\hat B_\eta(a)$
if $n'=ln$. Clearly, the graphs $\Gamma_\eta(a)$ and $\Gamma_{\eta/a}(1)$ are isomorphic in the sense that
there exists a one-to-one map $\varphi:\hat B_\eta(a)\to\hat B_{\eta/a}(1)$ such that if $n,n'\in\hat B_\eta(a)$
and $n'=ln$ then $\varphi n,\varphi n'\in\hat B_{\eta/a}(1)$ and $\varphi n'=l\varphi n$. Since $X(n)$, $n\ge 1$ are i.i.d.,
$Z_{\eta,a}(V)$ is determined, in fact, by the isomorphism class of $\Gamma_\eta(a)$ and not by $\Gamma_\eta(a)$
itself, and so $Z_{\eta,a}(V)=Z_{\eta/a,1}(V)$. Since $|B_N(a)|$ determines the set $B_{N/a}(1)$ we
conclude that it determines $Z_{N,a}(V)$, as well, proving the claim.
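The claim that $Z_{N,a}(V)$ depends on $(N,a)$ only through $|B_N(a)|$ can be seen on a toy case by exact enumeration. The sketch below assumes $k=2$ (so the only prime is 2), a three-point state space with a hypothetical distribution `MU`, and a hypothetical bounded function `V`; none of these choices comes from the paper.

```python
import itertools
import math

K = 2                      # hypothetical: k = 2, so the only prime is 2
MU = (0.5, 0.3, 0.2)       # hypothetical distribution of X(1) on {0, 1, 2}

def V(x1, x2):
    """A hypothetical bounded function of two variables."""
    return 0.3 * x1 - 0.7 * x1 * x2 + 0.1 * x2

def B(N, a):
    """B_N(a) = {a, 2a, 4a, ...} intersected with [1, N], for odd a."""
    out, b = [], a
    while b <= N:
        out.append(b)
        b *= 2
    return out

def Z(N, a):
    """Z_{N,a}(V) = E exp(sum_{b in B_N(a)} V(X(b), X(2b))) by exact enumeration."""
    base = B(N, a)
    sites = sorted(set(base) | {K * b for b in base})   # the vertex set \hat B_N(a)
    total = 0.0
    for xs in itertools.product(range(len(MU)), repeat=len(sites)):
        x = dict(zip(sites, xs))
        prob = math.prod(MU[i] for i in xs)
        total += prob * math.exp(sum(V(x[b], x[K * b]) for b in base))
    return total

# Three pairs (N, a) with |B_N(a)| = 4 in each case.
print(Z(40, 5), Z(8, 1), Z(24, 3))
```

All three values coincide because the index sets $\{5,10,20,40\}$, $\{1,2,4,8\}$ and $\{3,6,12,24\}$ carry isomorphic multiplication graphs.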

Let $l=|B_N(a)|$ and set $R_l(V)=Z_{N,a}(V)$ since the latter depends only on $l$
(and, of course, on $V$). Observe that

(4.5) $|\ln R_l(V)|\le lC(V)$

where $C(V)=\sup_{x_1,\dots,x_k\in M}|V(x_1,\dots,x_k)|$. Set $A_N^{(l)}=\{a\in A_N:\ |B_N(a)|=l\}$.
By (4.3),

(4.6) $|A_N^{(l)}|\le N2^{-(l^{1/m}-1)}.$

Observe that $|D(\rho)|$ is a nondecreasing right continuous piecewise constant function
and since $r_1,r_2,\dots,r_m$ are primes the jumps of $|D(\rho)|$ can only be of size 1, i.e.
for all $\rho>0$,

$|D(\rho)|-\lim_{\tilde\rho\uparrow\rho}|D(\tilde\rho)|\le 1.$

It follows that

$\rho_{\min}(l)=\inf\{\rho\ge 0:\ |D(\rho)|=l\}$ and $\rho_{\max}(l)=\sup\{\rho\ge 0:\ |D(\rho)|=l\}$

are well defined for each integer $l\ge 1$ and $\rho_{\min}(l)<\rho_{\max}(l)$. Denote $\tilde A_N^{(l)}=\{a\in\mathbb N:\ Ne^{-\rho_{\max}(l)}<a\le Ne^{-\rho_{\min}(l)}$, $a$ is relatively prime with $r_1,r_2,\dots,r_m\}$. Then
by (4.2) and the above,

(4.7) $\frac 1N\big||A_N^{(l)}|-|\tilde A_N^{(l)}|\big|\to 0$ as $N\to\infty$.



We will show next that the limit

(4.8) $\lim_{N\to\infty}\frac 1N|A_N^{(l)}|=\big(e^{-\rho_{\min}(l)}-e^{-\rho_{\max}(l)}\big)r$

exists with

(4.9) $r=1-\frac 12-\frac 13+\frac 1{2\cdot 3}-\frac 15+\frac 1{2\cdot 5}+\frac 1{3\cdot 5}-\frac 1{2\cdot 3\cdot 5}+\cdots+(-1)^m\frac 1{r_1r_2\cdots r_m}.$

Indeed, for each integer $n\ge 1$ set $G(n)=\{in:\ i\in\mathbb Z_+\}$ and $G_N^{(l)}(n)=\{j\in G(n):\ Ne^{-\rho_{\max}(l)}<j\le Ne^{-\rho_{\min}(l)}\}$. Then (by the inclusion-exclusion principle),

(4.10) $|\tilde A_N^{(l)}|=|G_N^{(l)}(1)|-|G_N^{(l)}(2)|-|G_N^{(l)}(3)|+|G_N^{(l)}(2\cdot 3)|+\cdots+(-1)^m|G_N^{(l)}(r_1r_2\cdots r_m)|.$

Since each $G(n)$ is an arithmetic progression with the difference $n$ we obtain that

(4.11) $\lim_{N\to\infty}\frac 1N|G_N^{(l)}(n)|=\frac 1n\big(e^{-\rho_{\min}(l)}-e^{-\rho_{\max}(l)}\big)$

and (4.8)-(4.9) follow from (4.10)-(4.11).
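The limit (4.8) can be illustrated numerically. The sketch below again assumes the primes 2 and 3 and a hypothetical $N=3000$. It uses the observation, made here and not in the text, that $|D(\rho)|$ counts the integers $s\le e^\rho$ with all prime factors in $\{r_1,\dots,r_m\}$, so that $e^{\rho_{\min}(l)}$ and $e^{\rho_{\max}(l)}$ are the $l$-th and $(l+1)$-th such integers. It also checks that the alternating sum (4.9) equals $\prod_i(1-1/r_i)$.

```python
import math
from itertools import combinations

PRIMES = (2, 3)   # hypothetical choice: the primes not exceeding k = 4

def smooth_numbers(limit):
    """Sorted integers <= limit whose prime factors all lie in PRIMES."""
    out = [1]
    for p in PRIMES:
        extended = []
        for s in out:
            while s <= limit:
                extended.append(s)
                s *= p
        out = extended
    return sorted(out)

def class_count(N, l):
    """|A_N^{(l)}|: a <= N relatively prime with PRIMES and |B_N(a)| = l."""
    return sum(1 for a in range(1, N + 1)
               if all(a % p for p in PRIMES) and len(smooth_numbers(N // a)) == l)

def r_inclusion_exclusion():
    """The alternating sum (4.9) over all subsets of PRIMES."""
    return sum((-1) ** size / math.prod(subset)
               for size in range(len(PRIMES) + 1)
               for subset in combinations(PRIMES, size))

def limit_density(l):
    """Right hand side of (4.8) with e^{rho_min(l)} = s_l, e^{rho_max(l)} = s_{l+1}."""
    s = smooth_numbers(10 ** 6)
    return r_inclusion_exclusion() * (1.0 / s[l - 1] - 1.0 / s[l])

N = 3000
for l in range(1, 6):
    print(l, class_count(N, l) / N, limit_density(l))
```

For each $l$ the empirical frequency and the limiting density should differ by an amount of order $1/N$.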

Observe that by (4.2), (4.3) and the definition of $\rho_{\min}$ and $\rho_{\max}$,

(4.12) $\rho_{\max}(l)>\rho_{\min}(l)\ge(l^{1/m}-1)\ln 2,$

and so we obtain from (4.1) and (4.5)-(4.9) that

(4.13) $\frac 1N\ln Z_N(V)=\frac 1N\sum_{a\in A_N}\ln Z_{N,a}(V)$

$\qquad=\frac 1N\sum_{1\le l\le(1+\frac 1{\ln 2}\ln N)^m}|A_N^{(l)}|\ln R_l(V)$

$\qquad\to r\sum_{l=1}^\infty\big(e^{-\rho_{\min}(l)}-e^{-\rho_{\max}(l)}\big)\ln R_l(V)$ as $N\to\infty$

while the last series converges absolutely in view of (4.5) and (4.12). Furthermore, if
$V=V_\lambda$ depends on a parameter $\lambda$ in a differentiable way with a derivative bounded
by $C$ then each $\ln R_l(V_\lambda)$ is also differentiable in $\lambda$ with a derivative bounded by
$Cl$. Hence, in this case we can differentiate in $\lambda$ the series in the right hand side of
(4.13) and the assertion of Theorem 2.7 follows. □

4.1. Remark. Arguments of the present section yield also moderate deviations
estimates for sums $S_N(V)$ given by (2.36) in the above i.i.d. setup. Namely, let $\bar V=\int V(x_1,x_2,\dots,x_k)\,d\mu(x_1)d\mu(x_2)\cdots d\mu(x_k)$, where $\mu$ is the probability distribution of
$X(1)$, and observe that $\bar V=EV(X(n),X(2n),\dots,X(kn))$ for any $n\ge 1$. Then for
any $\kappa\in(0,\frac 12)$,

(4.14) $\limsup_{N\to\infty}N^{2\kappa-1}\ln P\{N^{\kappa-1}S_N(V-\bar V)\in K\}\le-\frac 1{2\sigma^2}\inf_{u\in K}u^2$

for any closed set $K\subset\mathbb R$ and

(4.15) $\liminf_{N\to\infty}N^{2\kappa-1}\ln P\{N^{\kappa-1}S_N(V-\bar V)\in U\}\ge-\frac 1{2\sigma^2}\inf_{u\in U}u^2$

for any open set $U\subset\mathbb R$ provided that for any $\lambda\in\mathbb R$,

(4.16) $\lim_{N\to\infty}N^{2\kappa-1}\ln E\exp\big(\lambda N^{-\kappa}S_N(V-\bar V)\big)=\frac 12\sigma^2\lambda^2$

(cf. [H] and [H]). In order to compute the limit (4.16) we observe relying on
the same arguments as above that $v_l(V)=E(S_{N,a}(V-\bar V))^2$ depends only on
$l=|B_N(a)|$ and on $V$ where, recall, $S_{N,a}$ was defined in (2.36). It follows that

(4.17) $\ln Z_{N,a}\big(\lambda N^{-\kappa}(V-\bar V)\big)=\frac 12\lambda^2N^{-2\kappa}v_l(V)+O\big(|\lambda|^3N^{-3\kappa}\|V\|^3l^3\big)$

provided $|B_N(a)|=l$. Then in the same way as in (4.13),

(4.18) $\lim_{N\to\infty}N^{2\kappa-1}\ln Z_N\big(\lambda N^{-\kappa}(V-\bar V)\big)$
$\qquad=\frac 12\lambda^2\lim_{N\to\infty}N^{-1}\sum_{1\le l\le(1+\frac 1{\ln 2}\ln N)^m}|A_N^{(l)}|v_l(V)$

$\qquad=\frac 12\lambda^2r\sum_{l=1}^\infty\big(e^{-\rho_{\min}(l)}-e^{-\rho_{\max}(l)}\big)v_l(V)$

and (4.16) follows under a nondegeneracy condition $v_l(V)\ne 0$ whenever $A_N^{(l)}\ne\emptyset$.
5. NONCONVENTIONAL LARGE DEVIATIONS FOR DYNAMICAL SYSTEMS

In this section we discuss nonconventional large deviations results in the dynamical
systems case and a reader who is not familiar with hyperbolic dynamical
systems and is interested only in the probabilistic setup may skip this section altogether.
We assume now that $T:M\to M$ is either a subshift of finite type or a $C^2$
expanding endomorphism or a hyperbolic diffeomorphism on a compact Riemannian
manifold (see [3] and [21]). By the latter we mean a $C^2$ Anosov diffeomorphism
or, more generally, a $C^2$ diffeomorphism defined in a neighborhood of a hyperbolic
attractor. We identify now the probability space $(\Omega,\mathcal F,P)$ with $(M,\mathcal B,\mu)$ where $\mathcal B$
is the Borel $\sigma$-algebra on $M$ and $\mu$ is a Gibbs $T$-invariant measure constructed by
a Hölder continuous potential $g$ (see [3] and [21]). Let $k=1$ in (2.3), (2.4) and
(2.17), (2.18).

5.1. Theorem. Let $X(n)=X(n,\omega)=X(n,x)=f(T^nx)$, $n\ge 0$, where $f$ is a
Hölder continuous (vector) function, and we take also $q_j$'s as in Theorem 2.1. Let
$k=1$ then for any $W_\lambda=W_\lambda(x_1,\dots,x_\ell)$ continuous in $x_1,\dots,x_\ell$,

(5.1) $Q(W_\lambda)=\lim_{N\to\infty}\frac 1N\ln\int_M\exp\Big(\sum_{n=1}^NW_\lambda\big(T^{q_1(n)}x,\dots,T^{q_\ell(n)}x\big)\Big)\,d\mu(x)=\mathfrak P(\ln\hat W_\lambda+g)$

with $\hat W$ defined by (2.5), $g$ being the potential of $\mu$ and $\mathfrak P(\cdot)$ being the topological
pressure of a function in brackets for the transformation $T$ (see [3] and [21]). If
the derivative $dW_\lambda/d\lambda$ exists and is bounded in $x_1,\dots,x_\ell$ for each $\lambda$ then $Q(W_\lambda)$
is differentiable in $\lambda$, as well. In the expanding and hyperbolic cases the limit in
(5.1) remains the same if we integrate in (5.1) either with respect to the normalized
Riemannian volume or with respect to the Sinai-Ruelle-Bowen (SRB) measure $\mu=\mu_{SRB}$
which is the Gibbs measure corresponding to the potential $g=-\ln\varphi$ where $\varphi$
is the Jacobian of the differential $DT$ restricted to unstable leaves (see [3] and [21]).
The large deviations estimates (2.9) and (2.10) hold true with the rate functional $J$
given by (2.8) with $Q(W_\lambda)$ for $W_\lambda=\lambda F$ given by (5.1) in place of $r(W_\lambda)$ in (2.8).

Proof. For $T$ being a $C^2$ Axiom A diffeomorphism (in particular, Anosov) in a
neighborhood of an attractor or $T$ being an expanding $C^2$ endomorphism of a
Riemannian manifold $M$ (see [3]) let $\xi$ be a finite Markov partition for $T$. Then we
can take $\mathcal F_{kl}$ to be the finite $\sigma$-algebra generated by the partition $\cap_{i=k}^lT^i\xi$. Another
case for the above theorem is when $T$ is a topologically mixing subshift of finite
type, i.e. $T$ is the left shift on a subspace $\Xi$ of the space of one-sided sequences
$\varsigma=(\varsigma_i,\ i\ge 0)$, $\varsigma_i=1,\dots,l_0$ such that $\varsigma\in\Xi$ if $\pi_{\varsigma_i\varsigma_{i+1}}=1$ for all $i\ge 0$ where
$\Pi=(\pi_{ij})$ is an $l_0\times l_0$ matrix with 0 and 1 entries and such that $\Pi^n$ for some $n$ is
a matrix with positive entries. Again, we take $\mu$ to be a Gibbs invariant measure
corresponding to some Hölder continuous function and define $\mathcal F_{kl}$ as the finite
$\sigma$-algebra generated by cylinder sets with fixed coordinates having numbers from $k$
to $l$. The exponentially fast $\psi$-mixing is well known in the above cases (see [3]). In
fact, convergence to zero of the modified $\psi$-mixing coefficient $\psi_{P,\mu}(n)$ holds true,
as well, in the hyperbolic and expanding case when $P$ is the normalized Riemannian
volume and $\mu=\mu_{SRB}$.

If the function $W_\lambda=W_\lambda(x_1,\dots,x_\ell)$ is continuous in $x_1,\dots,x_\ell$ then $\beta_{W_\lambda}(n)$ from
Proposition 3.1 tends to zero as $n\to\infty$, and so the condition (3.1) will be satisfied
here. It follows from [16] that

$\lim_{N\to\infty}\frac 1N\ln\int\exp\Big(\sum_{n=1}^N\ln\hat W_\lambda(f(T^nx))\Big)\,d\mu(x)=\mathfrak P(\ln\hat W_\lambda+g)$

and Theorem 5.1 follows from Proposition 3.1 and Corollary 3.2 considered with
$k=1$ since in our circumstances differentiability of the topological pressure in
parameters of the potential is well known (see, for instance, [24] and [22]). □
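For readers less familiar with the topological pressure appearing in (5.1), the following sketch treats the simplest case: a subshift of finite type with a 0-1 transition matrix and a potential depending only on the zero coordinate. In that case the pressure is the logarithm of the spectral radius of the weighted transition matrix, and it is also the exponential growth rate of the weighted count of admissible words. The golden mean shift and the potential values used below are hypothetical examples, not objects from the paper.

```python
import math

def pressure_via_words(adj, psi, n):
    """(1/n) ln of the sum over admissible words w_0...w_{n-1} of exp(sum_i psi[w_i])."""
    size = len(adj)
    z = [math.exp(psi[j]) for j in range(size)]   # words of length 1, by last symbol
    log_scale = 0.0
    for _ in range(n - 1):
        z = [sum(z[i] * adj[i][j] for i in range(size)) * math.exp(psi[j])
             for j in range(size)]
        s = max(z)
        z = [v / s for v in z]                    # rescale to avoid overflow
        log_scale += math.log(s)
    return (log_scale + math.log(sum(z))) / n

def pressure_via_spectrum_2x2(adj, psi):
    """ln of the largest eigenvalue of the 2x2 matrix adj[i][j] * exp(psi[j])."""
    m = [[adj[i][j] * math.exp(psi[j]) for j in range(2)] for i in range(2)]
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return math.log((tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0)

golden = [[1, 1], [1, 0]]        # golden mean shift: the word "1 1" is forbidden
psi = [0.3, -0.2]                # hypothetical potential of the zero coordinate
print(pressure_via_words(golden, psi, 5000), pressure_via_spectrum_2x2(golden, psi))
```

With the zero potential the same computation returns the topological entropy of the shift, which for the golden mean shift is the logarithm of the golden ratio.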

5.2. Remark. (i) A version of Theorem 2.2 can also be obtained in the present
dynamical systems setup where the limit $Q(W)=\mathfrak P(\ln\hat W(x)+g)$ is obtained in the
same way as in Theorem 5.1. Since $\mathfrak P(g)$ is Gateaux differentiable at any Hölder
continuous $g$ (see [215] and [53]) then $Q(W)$ is also Gateaux differentiable at any
Hölder continuous $W$ and the large deviations for occupational measures

$\zeta_N=\zeta_{N,x}=\frac 1N\sum_{n=1}^N\delta_{(T^{q_1(n)}x,\dots,T^{q_\ell(n)}x)}$

follow from Section 4.5.3 in [TT] with a rate function which is the Fenchel-Legendre
transform of $Q$.

(ii) Theorem 2.6 provides a direct application to the dynamical systems case
when $T$ is a full shift (on a finite alphabet sequence space) considered with a
Bernoulli invariant measure taking $X(n)=f\circ T^n$ with a function $f$ on the sequence
space depending only on the zero coordinate. Nonconventional large deviations when
$k>1$ for more general cases (e.g. subshifts of finite type with Gibbs invariant
measures, hyperbolic and expanding transformations etc.) require more elaborate
techniques and they will not be treated in this paper.